Posters IMPUTATION OF MISSING GENOTYPES IN HIGH DENSITY SNP DATA
Authors
Abstract
The accuracy and computational complexity of five methods for imputing missing genotypes in high-density SNP data were investigated. The haplotype reconstruction package fastPHASE reached the highest accuracies (91% to 98%) across varying proportions (0.2% to 8%) of missing genotypes. Alternative methods based on principal component analysis were less accurate (67% to 94%), but their computational demand was an order of magnitude lower.
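The abstract does not say which principal component variants were compared, so the following is only a minimal sketch of one common approach, not the method used in the poster. It assumes genotypes are coded 0, 1, 2 with `np.nan` for missing calls, fills the gaps by iterating a low-rank SVD reconstruction, and scores accuracy as the proportion of masked genotypes recovered exactly. The function names `pca_impute` and `imputation_accuracy`, the number of components, and the iteration count are all illustrative assumptions.

```python
import numpy as np


def pca_impute(genotypes, n_components=2, n_iter=20):
    """Fill missing genotypes (np.nan) using an iterated low-rank SVD fit.

    Assumes rows are individuals, columns are SNPs, values are 0/1/2,
    and every SNP has at least one observed genotype.
    """
    g = np.asarray(genotypes, dtype=float)
    missing = np.isnan(g)

    # Start by filling each missing entry with its SNP (column) mean.
    col_means = np.nanmean(g, axis=0)
    filled = np.where(missing, col_means, g)

    for _ in range(n_iter):
        # Centre, keep the leading principal components, reconstruct.
        mu = filled.mean(axis=0)
        u, s, vt = np.linalg.svd(filled - mu, full_matrices=False)
        approx = (u[:, :n_components] * s[:n_components]) @ vt[:n_components] + mu
        # Only the missing entries are updated; observed calls stay fixed.
        filled[missing] = approx[missing]

    # Round the continuous estimates back to valid genotype codes.
    imputed = g.copy()
    imputed[missing] = np.clip(np.rint(filled[missing]), 0, 2)
    return imputed


def imputation_accuracy(true_genotypes, imputed, mask):
    """Proportion of masked genotypes that were imputed correctly."""
    return float(np.mean(true_genotypes[mask] == imputed[mask]))
```

To evaluate such a sketch, one would mask a known fraction of a complete genotype matrix (for example 0.2% to 8%, as in the abstract), call `pca_impute` on the masked copy, and pass the original matrix, the result, and the mask to `imputation_accuracy`.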
Similar resources
Imputation of missing single nucleotide polymorphism genotypes using a multivariate mixed model framework.
The objective of this paper was to investigate, for various scenarios at low and high marker density, the accuracy of imputing genotypes when using a multivariate mixed model framework using information from 2, 4, or 10 surrounding markers. This model predicts genotypes at a locus, using genotypes at nearby loci as correlated traits, and the additive genetic relationship matrix to use informati...
Imputation of parent-offspring trios and their effect on accuracy of genomic prediction using Bayesian method
The objective of this study was to evaluate the imputation accuracy of parent-offspring trios under different scenarios. Using simulated datasets, the performance of Bayesian LASSO in genomic prediction was also examined. The genome consisted of 5 chromosomes, each 1 Morgan long. The number of SNPs per chromosome was 10000. One hundred QTLs were randomly distributed a...
Effect of reference population size and available ancestor genotypes on imputation of Mexican Holstein genotypes.
The effects of reference population size and the availability of information from genotyped ancestors on the accuracy of imputation of single nucleotide polymorphisms (SNP) were investigated for Mexican Holstein cattle. Three scenarios for reference population size were examined: (1) a local population of 2,011 genotyped Mexican Holsteins, (2) animals in scenario 1 plus 866 Holsteins in the US ...
Comparison of three boosting methods in parent-offspring trios for genotype imputation using simulation study
Background: Genotype imputation is an important process for predicting unknown genotypes; it uses a reference population with dense genotypes to predict missing genotypes for both human and animal genetic variations at low cost. Machine learning methods, especially boosting methods, have been used in genetic studies to explore the underlying genetic profile of disease and build models capable of ...
Missing data imputation in multivariable time series data
Multivariate time series data are found in a variety of fields such as bioinformatics, biology, genetics, astronomy, geography and finance. Many time series datasets contain missing data. Imputing missing data in multivariate time series is a challenging topic and needs to be carefully considered before learning from or predicting time series. Much research has been done on the use of diffe...